REPLAY INSTRUCTION MORPHING



BACKGROUND 

1. Field 

The present disclosure pertains to the field of processors. More particularly, the 
present disclosure pertains to a processor that may alter, transform, mutate, or otherwise 
"morph" instructions when difficulties are encountered during one or more initial 
attempts to execute such instructions. 

2. Description of Related Art 

Improving the performance of computers or other processing systems generally 
improves overall throughput and/or provides a better user experience. Such improved 
system performance may be achieved by increasing the rate at which instructions for the 
system are processed by a processor. Accordingly, it is desirable to produce advanced 
processors with improved instruction throughput. 

Continuing to increase the performance of a processor, however, is a difficult task. 
Prior art processors already employ techniques of branch prediction, speculative execution,
and out-of-order (OOO) execution. Additionally, such processors typically include
multiple parallel execution units to process numerous instructions in parallel. As
increasing amounts of parallel hardware are employed, providing sufficient instructions to
keep this hardware busy becomes increasingly difficult due to the limited instruction level
parallelism which may be extracted or due to instruction dependencies present in many
existing software programs.

Multi-threading is one technique that may be used to reduce idle time for parallel 
execution units. Multi-threading allows multiple programs or threads to share hardware 
resources. Due to the separate program sequences being executed, there is less likelihood 
of instruction dependencies seriously reducing execution unit utilization. Such
multi-threaded machines inherently benefit from the additional parallelism resulting from
executing multiple threads as long as multiple threads can be extracted or are explicitly
provided by the software being executed.

Thus, large amounts of effort in designing modern processors have been applied to
such instruction-dispatch focused techniques. These techniques at least in part strive to 
increase the number of instructions dispatched to the intended execution units. At times, 
however, significant latency-causing problems are encountered post-dispatch (e.g., faults, 
numeric computation problems, cache misses, etc.). An execution unit in a prior art 
processor is generally "stuck" with the instruction it got once the instruction has been 
dispatched to the execution unit. 

Instruction decoding is a type of alteration of an instruction that occurs after an
instruction is received by a processor. Instruction decoding, however, generally involves 
expanding an instruction into microinstructions, or changing the encoding of an instruction 
into a more convenient form or another instruction set for execution by an execution unit. 
Instruction decoding does not generally go beyond a particular mapping of an input 
instruction to either individual signals or individual microinstructions. Moreover, 
instruction decoding is an inherently front-end operation in processing systems and lacks 
the ability to incorporate information gleaned throughout execution of an instruction.
Thus, prior art processors generally do not morph original instructions into altered
instructions that execute more efficiently or otherwise differently than the original
instructions once attempted execution has occurred.

Brief Description of the Figures



The present invention is illustrated by way of example and not limitation in the 
figures of the accompanying drawings. 

Figure 1a illustrates one embodiment of a processor employing an instruction
morphing circuit.

Figure 1b illustrates techniques for morphing instructions which may be
employed by the system of Figure 1a.

Figure 2 illustrates one embodiment of a technique for morphing load instructions 
when a cache miss occurs in a cache memory. 

Figures 3a-3d illustrate various embodiments of techniques for dealing with 
instruction dependencies using instruction morphing. 

Figure 4 illustrates another embodiment of a system that utilizes disclosed 
instruction morphing techniques. 

Figure 5 illustrates one embodiment of a technique for handling page faults using 
instruction morphing. 

Figure 6 illustrates one embodiment of a technique for handling indirect 
instructions such as indirect load instructions using instruction morphing. 

Figure 7 illustrates one embodiment of a system that uses instruction morphing in 
conjunction with numerical processing. 

Figure 8 illustrates one embodiment of techniques for handling certain rare data
dependent mathematical operations.

Figure 9 illustrates various design representations or formats for simulation,
emulation, and fabrication of a design using the disclosed techniques.

Detailed Description

The following description sets forth techniques for replay instruction morphing. In the following
description, numerous specific details such as types of original and morphed instructions, 
circumstances under which morphing may be appropriate, system environments in which 
morphing may be embodied, execution unit and morphing circuitry interactions, and logic 
partitioning/integration choices are set forth in order to provide a more thorough 
understanding of the present invention. It will be appreciated, however, by one skilled in 
the art that the invention may be practiced without such specific details. In other 
instances, control structures and gate level circuits have not been shown in detail in order 
not to obscure the invention. Those of ordinary skill in the art, with the included 
descriptions, will be able to implement appropriate logic circuits without undue 
experimentation. 

The presently disclosed instruction morphing techniques may advantageously 
allow more efficient execution of instructions in a processing system. By morphing 
certain instructions when particular hardware is unavailable or when proper completion is 
otherwise recognized as not being presently possible, the processor may free resources for 
use in performing other tasks. 

One embodiment of a processor that performs instruction morphing is shown in
Figure 1a. The processor of Figure 1a includes an execution unit 125 which receives
instructions from a multiplexer 115. A checker 150 is coupled to the execution unit 125
and determines whether instructions have executed properly. Additional checkers and/or
execution units may be added in some embodiments. Furthermore, a staging queue (not
shown) may receive instructions from the multiplexer 115 and pass the instructions to the
checker 150 for checking in due course. Properly executed instructions are forwarded on
to retirement, whereas improperly executed instructions are fed back to the multiplexer
115 for re-execution.

There are two ways an instruction can be fed back to the execution unit 125 from 
the checker 150. Morphing logic 120 is coupled to receive original instructions from the 
checker 150, and may detect a condition that warrants morphing of the instruction. In 
some cases, however, morphing is inappropriate. Therefore, the morphing logic 120 may 
return either the original instructions or morphed instructions to the multiplexer 115. In 
some embodiments, certain types of instructions or specific instructions may be 
automatically morphed by the morphing logic 120. In other embodiments, certain 
conditions may cause morphing logic 120 to perform morphing operations. Various 
delays may be introduced or conditions tested prior to instructions being returned to the 
execution unit 125. 

Figure 1b illustrates two techniques for morphing instructions which may be
employed by the system of Figure 1a. At block 160, the processor attempts execution of
the original instruction. The original instruction is received from the RECEIVED 
INSTRUCTIONS input of the multiplexer 115, and then passed to the execution unit to 
accomplish the execution indicated in block 160. The received instructions may be 
received from various decoding, caching, or other front-end processing logic. 

As indicated in block 165, a problem preventing successful present execution of
the original instruction is detected. In the embodiment of Figure 1a, this detection is
accomplished by the checker 150. If the instruction and/or the conditions indicate that
the instruction should be replayed without alteration, the morphing logic 120 may return
the instruction to the multiplexer 115 without change.

If the instruction and/or the conditions indicate that the instruction should be
morphed, the morphing logic 120, as indicated in block 170, alters the instruction so that
it will execute more efficiently or at least differently. Various embodiments of specific
morphing operations will be discussed below. As indicated in block 175, the morphed
instruction is then executed.

In some cases, the morphed instruction is intended to replace the original
instruction. In such cases, as indicated in block 180, retirement of the morphed instruction
completes the execution which was expected from the original instruction. Here,
the morphed instruction is a substitute instruction which produces the same results as the
original instruction; however, the morphed instruction was at the time perceived to be a
better or more efficient way of achieving those results.

In other cases, an instruction may be morphed to satisfy a precondition to the
original instruction's successful execution. For example, a memory access may cause a
page fault to occur. A precondition to proper execution of the memory access is that the
page fault be resolved. A particular precondition may require numerous morphing
operations to satisfy. Therefore, as indicated in block 185, the processor checks to
determine whether the precondition is satisfied by the execution of a morphed instruction.
If not, further morphing may be performed as indicated by the return to block 170. If the
precondition is satisfied, the original instruction may be restored as indicated in block
190. Thereafter, the original instruction may be retired as indicated in block 195. In
some cases, a replacement instruction may be executed instead of the original instruction
once the precondition has been satisfied.
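
The flow of Figure 1b may be sketched in software as follows. This is a minimal illustrative model under stated assumptions, not the disclosed circuit: the names Instr, run, execute_ok, and morph are hypothetical, and a successfully executed precondition morph is assumed to satisfy the precondition in a single step (block 185), whereas an actual embodiment may require several morphing operations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    name: str
    # Set only when this instruction is a morph that satisfies a precondition
    # of another instruction (hypothetical field, for illustration).
    original: Optional["Instr"] = None


def run(instr, execute_ok, morph, max_replays=8):
    """Replay loop sketch: returns the names of the instructions attempted."""
    trace = []
    current = instr
    for _ in range(max_replays):
        trace.append(current.name)          # blocks 160/175: attempt execution
        if execute_ok(current):             # checker: executed properly?
            if current.original is None:
                return trace                # retire (blocks 180/195)
            current = current.original      # restore the original (block 190)
            continue
        current = morph(current)            # block 170; may return it unchanged
    raise RuntimeError("replay limit reached")


# Usage: a load that cannot execute until its page has been mapped.
mapped = set()


def execute_ok(instr):
    if instr.name == "map_page":
        mapped.add("page")
        return True
    return "page" in mapped


def morph(instr):
    # Morph a failing load into an instruction that satisfies its precondition.
    if instr.name == "load":
        return Instr("map_page", original=instr)
    return instr


trace = run(Instr("load"), execute_ok, morph)
```

In this sketch the same replay slot carries the load, then the morphed instruction, then the restored load, which mirrors the path through blocks 170, 175, 185, 190, and 195.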

Figure 2 illustrates one embodiment of a technique for morphing load instructions 
when a cache miss occurs in a cache memory. As indicated in block 200, the processor 
determines that a load instruction missed in a second level or above cache. By second 
level, it is meant the second lowest hierarchical cache, regardless of its particular label. 
In the embodiment of Figure 1a, the checker 150 receives a miss signal from the lowest
level cache and therefore determines that the load instruction executed improperly. The 
morphing logic 120 receives signals (not shown) indicating that a higher level cache has 
also experienced a cache miss. 

Under these conditions, it may be wasteful to continuously test the higher level 
cache as the load instruction circulates through the replay loop because the needed data 
will be written to both the higher level cache and the lowest level cache when retrieved. 
Therefore, bandwidth of the higher level cache may be saved by attempting, subsequent to
the higher level cache miss, to retrieve the data only from the lowest level cache.
Accordingly, as indicated in block 210, the load may be morphed to perform lookups 
only in the lowest level cache in subsequent iterations. 

If valid data is found in the lowest level cache, as tested in block 215, then the 
load will execute properly. The original load may then be retired when the checker 
detects correct execution of the morphed load, as indicated in block 225. If valid data is 
not found in the lowest level cache, the morphed load instruction is replayed as indicated 
in block 220. While the load may continue to unsuccessfully execute a number of times, 
at least it does not wastefully consume bandwidth of the higher level cache in the process.
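
The technique of Figure 2 may be modeled as shown below. The sketch is illustrative only; the class names Load and Caches, the lowest_only flag, and the lookup counter are assumptions introduced here to make the bandwidth saving visible.

```python
from dataclasses import dataclass


@dataclass
class Load:
    addr: int
    lowest_only: bool = False   # morphed form: skip the higher level cache


class Caches:
    def __init__(self):
        self.lowest = {}            # lowest level cache: addr -> data
        self.higher = {}            # higher level cache: addr -> data
        self.higher_lookups = 0     # counts higher level cache accesses

    def attempt(self, load):
        """Return the data on a hit, or None if the load must be replayed."""
        if load.addr in self.lowest:
            return self.lowest[load.addr]
        if not load.lowest_only:
            self.higher_lookups += 1
            if load.addr in self.higher:
                return self.higher[load.addr]
            load.lowest_only = True   # block 210: morph after the higher level miss
        return None

    def fill(self, addr, data):
        # Retrieved data is written to both cache levels.
        self.higher[addr] = data
        self.lowest[addr] = data


caches = Caches()
load = Load(addr=0x40)
results = [caches.attempt(load) for _ in range(3)]   # three attempts, all miss
caches.fill(0x40, "data")
results.append(caches.attempt(load))                 # hit in the lowest level
```

Although the load is attempted four times, the higher level cache is consulted only once, on the first attempt.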

Figure 3a illustrates one embodiment of a technique for dealing with instruction
dependencies using instruction morphing. As indicated in block 300, a dependent
instruction and the previous instruction on which it depends are identified. In block 310,
the inability to presently execute the previous instruction is detected. In the embodiment
of Figure 1a, block 310 may be accomplished by the checker 150 detecting the erroneous
execution of the previous instruction.

Since the instruction on which the dependent instruction depends cannot be
properly executed, it follows that the dependent instruction cannot presently be properly
executed. Therefore, continuously attempting to execute the dependent instruction may
be wasteful. Accordingly, as indicated in block 320, the dependent instruction may be
prevented from continuously executing by marking the instruction as "poisoned". An
instruction marked as "poisoned" is simply marked in a manner such that the replay
system and/or the execution units recognize that execution of the instruction should not
be attempted. For example, a valid bit may be suppressed so that the instruction appears
to be invalid and therefore will not be executed. This technique may advantageously
reduce the number of unsuccessful attempts at executing a dependent instruction.

As indicated in block 330, a poison-clearing event is detected. A poison-clearing
event either specifically indicates that the dependency has been resolved or is an event
that could have caused the dependency to be resolved. In some embodiments, tracking
the exact conditions which will cause each individual dependency to be resolved may be
prohibitively expensive. Therefore, common events which may cause particular
dependencies to be resolved may be used to clear the poison indicators for one or more
instructions. As indicated in block 330, the dependent instruction is marked as safe (not
poisoned) for attempted execution.






Figure 3b illustrates one embodiment of the operations performed in block 330 of
Figure 3a. In this embodiment, the detection of any instruction retiring is performed in
block 332. As a result of the detection of the retirement of any instruction, all poisoned
instructions are reset so that execution will again be attempted as indicated in block 334.
This embodiment is relatively inexpensive in terms of the hardware required for
implementation; however, it may result in some undesirable execution of dependent
instructions where the dependencies have not yet cleared.

Figure 3c illustrates another embodiment of the operations performed in block
330 of Figure 3a. In this embodiment, a write to a lowest level cache is detected in block
336. The detection of this write causes all poisoned bits for instructions to be reset as
again indicated in block 334. This technique is also convenient in terms of the amount of
hardware required, but may also result in some unnecessary execution of dependent
instructions.
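
The coarse poison clearing of Figures 3a through 3c may be sketched as follows. The class ReplayQueue and its method names are hypothetical; the sketch only models the bookkeeping, in which any clearing event (a retirement or a lowest level cache write) resets every poison indicator at once.

```python
class ReplayQueue:
    """Tracks which instructions are poisoned and how often each is attempted."""

    def __init__(self):
        self.poisoned = set()
        self.attempts = {}

    def poison(self, name):
        # Block 320: mark the dependent instruction so it is not attempted.
        self.poisoned.add(name)

    def try_execute(self, name):
        """Return True if execution was attempted, False if suppressed."""
        if name in self.poisoned:
            return False
        self.attempts[name] = self.attempts.get(name, 0) + 1
        return True

    def on_clearing_event(self):
        # Block 334: any retirement (block 332) or lowest level cache write
        # (block 336) clears all poison indicators, whether or not the
        # particular dependency was actually resolved.
        self.poisoned.clear()


queue = ReplayQueue()
queue.poison("add")
passes = [queue.try_execute("add") for _ in range(3)]   # suppressed three times
queue.on_clearing_event()
passes.append(queue.try_execute("add"))                 # attempted again
```

Because the clearing is not specific to a dependency, the attempt that follows the event may still fail, which is the unnecessary execution noted above.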

Figure 3d illustrates another embodiment of a technique for dealing with
instruction dependencies using instruction morphing. Figure 3d shares blocks 300, 310,
and 320 with Figure 3a. After a dependent instruction and a previous instruction on
which it depends are identified in block 300, however, the embodiment of Figure 3d
includes an additional operation. As indicated in block 305, the dependent instruction is
tagged with an identifier that indicates the previous instruction on which it is dependent.
The identifier may be a sequence number of the instruction or any other value that serves
to identify the previous instruction. Notably, the tagging performed in block 305 may be
performed in a different sequence than the exact sequence shown in Figure 3d. For
example, the dependent instructions may not be tagged until after one or both of blocks
310 and 320.

In block 336, the retirement of an instruction is detected. Since dependent
instructions were earlier tagged with an indication of the instructions on which they
depend, instruction-specific poison clearing may be performed. In other words, when an
instruction retires, the poison indicators may be reset for only those instructions which
depend on the retired instruction by comparing any tagged dependent instructions'
indicators to the corresponding value for the instruction being retired. Thus, as indicated
in block 338, the poison indication(s) for dependent instruction(s) with indicators that
indicate the retired instruction are cleared. This technique may advantageously greatly
reduce unnecessary execution of dependent instructions when their correct execution is
precluded due to the fact that the instruction on which they are dependent has not yet
been completed.
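
The instruction-specific clearing of Figure 3d may be sketched as follows. The class TaggedPoison and the use of integer sequence numbers as identifiers are assumptions made for illustration; the disclosure permits any value that identifies the previous instruction.

```python
class TaggedPoison:
    """Poison indicators tagged with the sequence number they wait on."""

    def __init__(self):
        # dependent sequence number -> sequence number of the instruction
        # on which it depends (the tag applied in block 305)
        self.waiting_on = {}

    def poison(self, dependent_seq, producer_seq):
        self.waiting_on[dependent_seq] = producer_seq

    def is_poisoned(self, seq):
        return seq in self.waiting_on

    def on_retire(self, retired_seq):
        """Block 338: clear poison only for dependents of the retired instruction."""
        cleared = [dep for dep, prod in self.waiting_on.items()
                   if prod == retired_seq]
        for dep in cleared:
            del self.waiting_on[dep]
        return cleared


tags = TaggedPoison()
tags.poison(dependent_seq=11, producer_seq=10)
tags.poison(dependent_seq=13, producer_seq=12)
cleared = tags.on_retire(10)   # only instruction 11 becomes executable again
```

Instruction 13 remains poisoned after instruction 10 retires, so no attempt is wasted on it while instruction 12 is still outstanding.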

Figure 4 illustrates another embodiment of a system that utilizes instruction
morphing techniques. The embodiment of Figure 4 includes execution logic 425 which
receives instructions from a multiplexer 415. Additionally, a staging queue 410 receives
instructions from the multiplexer 415. The staging queue 410 stores instructions
dispatched to the execution logic 425 and passes such instructions on to a checker 450,
which is also coupled to the execution logic 425, to determine whether the execution
logic 425 has properly executed the instructions. As was the case in the embodiment of
Figure 1a, the checker replays improperly executed instructions. Morphing logic 420
may morph instructions depending on the particular instruction and/or the conditions
under which it improperly executed.

Also illustrated in Figure 4 is a page miss handler (PMH) 460 as well as a
translation lookaside buffer (TLB) 470 and a memory 480. According to known paging
techniques, the system stores a number of page table entries in the TLB 470. When a
page table entry is not found in TLB 470, a page walk is performed by the page miss
handler 460 to retrieve the page descriptor entry (PDE) and subsequently the page table
entry (PTE) from memory.

Figure 5 illustrates one embodiment of a technique for handling page faults in the
system of Figure 4. As indicated in block 500, a page fault producing instruction is
identified. This may be performed by the morphing logic 420, perhaps with inputs from
the execution logic 425 and/or the checker 450. As indicated in block 510, the morphing
logic 420 then morphs the page fault producing instruction into a load of a page
descriptor entry for the page which caused the fault. If instruction morphing were not
used to introduce the page descriptor entry load into the replay system, another
instruction might be prevented from executing.

For example, instead of having the morphing logic 420 perform a morphing
operation to retrieve the page descriptor entry, the page miss handler could insert a page
descriptor entry load into the execution stream via the dashed connection 465. This
newly added instruction would displace another instruction, causing the displaced
instruction to circulate again through the replay system before it is given a chance to
execute. Instead, since it is known that the page fault producing instruction cannot
successfully execute, it may be more efficient to morph that instruction rather than
displacing another instruction which could potentially successfully execute in the interim.

Similarly, after the page descriptor entry load is completed, as indicated in block
520, the page descriptor entry load may be morphed into a page table entry load (block
530). Again, this morphing technique avoids displacing another instruction. As indicated
in block 540, the page table entry load completes, and the instruction may be morphed
back into the original page fault producing instruction (block 550). This instruction may
now execute without causing a page fault, with the page fault being resolved without
displacing other operations in the replay system.
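
The sequence of Figure 5 may be sketched as a single replay slot that changes form. The function run_slot, the dictionary-based TLB and page directory, and the 10-bit index split are hypothetical choices made only so the example is concrete; they are not taken from the disclosure.

```python
def run_slot(tlb, page_directory, vpage):
    """Model one replay slot morphing through a page walk.

    tlb: dict mapping virtual page number -> physical frame
    page_directory: dict mapping directory index -> page table (a dict
        mapping table index -> physical frame)
    Returns (trace of slot forms, physical frame).
    """
    slot = "access"          # the original page fault producing instruction
    page_table = None
    trace = []
    while True:
        trace.append(slot)
        if slot == "access":
            if vpage in tlb:
                return trace, tlb[vpage]        # executes without a fault
            slot = "pde_load"                   # block 510: morph into PDE load
        elif slot == "pde_load":
            page_table = page_directory[vpage >> 10]    # block 520 completes
            slot = "pte_load"                   # block 530: morph into PTE load
        else:
            tlb[vpage] = page_table[vpage & 0x3FF]      # block 540 completes
            slot = "access"                     # block 550: morph back


tlb = {}
page_directory = {0: {5: 0x9000}}
trace, frame = run_slot(tlb, page_directory, vpage=5)
```

The whole page walk occupies the one slot that already held the faulting access, so no additional instruction is injected into the replay system.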

Figure 6 illustrates one embodiment of a technique for handling indirect 
instructions. Indirect instructions such as MOV EBX, MEM[EAX] involve two 
retrievals. First, the value of EAX must be retrieved in order to find the address of the 
data which is requested to be loaded into EBX. Second, the actual memory access to the 
address (the contents of EAX) of the requested data is performed. If the instruction is 
decoded into multiple microoperations, then additional resources are consumed. Thus, it 
may be advantageous to have indirect instructions which are not decoded into multiple 
microoperations, but rather which are morphed to perform the proper operations. 

Accordingly, the technique shown in Figure 6 may be used to allow a single 
microoperation to accomplish indirect addressing. In block 600, an indirect memory 
reference instruction is identified. Indirect memory referencing techniques and 
instructions are well known and will not be further discussed herein. This technique may 
be used for a variety of indirect or similar addressing techniques which implicitly require 
multiple memory or register accesses or a combination of memory and register accesses. 

In block 610, the instruction is morphed into an altered instruction which loads
the address of the requested data. In the above example (MOV EBX, MEM[EAX]), the
value of EAX is loaded and received by the memory execution unit as indicated
in block 620. Next, the instruction is morphed into a load of the requested data as
indicated in block 630, and a load of the memory location that was indicated by the EAX
register is performed. Accordingly, a single instruction slot may be used to perform
indirect or similar memory access techniques.
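
The two-step morphing of Figure 6 may be sketched as follows for the MOV EBX, MEM[EAX] example. The function indirect_load and the tuple encoding of the slot are hypothetical and serve only to show one slot taking two successive forms.

```python
def indirect_load(registers, memory, addr_reg):
    """Model a single slot performing an indirect load by morphing.

    registers: dict mapping register name -> value
    memory: dict mapping address -> data
    Returns (loaded value, trace of slot forms).
    """
    slot = ("load_address", addr_reg)        # block 610: morph into address load
    trace = []
    while True:
        kind, operand = slot
        trace.append(kind)
        if kind == "load_address":
            address = registers[operand]     # block 620: address is received
            slot = ("load_data", address)    # block 630: morph into data load
        else:
            return memory[operand], trace    # load of the indicated location


registers = {"EAX": 0x100}
memory = {0x100: 42}
value, trace = indirect_load(registers, memory, "EAX")
registers["EBX"] = value
```

Both retrievals are performed by the same slot in successive passes, rather than by two separate microoperations.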

Figure 7 illustrates an embodiment of replay instruction morphing that deals with
numerical computations. In the embodiment of Figure 7, an over-precise or high
precision case is handled specially by the replay system. An over-precise or high
precision case may be either an instruction or a particular data-dependent case which
requires additional cycles or hardware to compute a result to the desired precision. Thus,
the need for additional resources may be due to the precision requested by the instruction
or the particular numbers involved.

In block 700, the over-precise or high precision case is detected. Instead of
attempting to compute the final result, the execution unit computes an intermediate result
as indicated in block 710. The instruction is morphed, as indicated in block 715, and then
tagged as an over-precise replay. Next, the morphed or altered instruction using the
intermediate result is executed, as indicated in block 720. The final result is placed into
the proper destination location as indicated in block 725.

There may be several reasons why the computation of only an intermediate result
is advantageous. In some cases, it may be possible to use simpler hardware that cannot
compute results for all input data in the same number of cycles. Typically, some rare
cases require significant additions to hardware to ensure proper handling in the same time
frame as other numbers. In such cases, the correct final result may be obtained via replay,
and hardware may be saved. Additionally, some different higher precision instructions
may advantageously be handled in a manner similar to lower precision instructions,
except that they are passed back through the replay system to compute final and
sufficiently accurate results.

Similarly, Figure 8 illustrates one embodiment of techniques for handling certain 
rare data dependent mathematical operations. In the embodiment of Figure 8, 
substitutions of instructions and/or operands may be performed to advantageously 
simplify hardware. Again, hardware otherwise needed to handle difficult and rare cases 
may be eliminated, resulting in a more compact part, with only rare impacts to 
performance. 

As indicated in block 800, a data dependent computationally intensive or 
hardware intensive mathematical operation is detected. For example, certain round 
instructions are very computationally intensive and therefore require significant amounts 
of hardware. In block 810, the instruction is morphed into a less hardware and/or 
computation intensive operation. For example, a round operation may be morphed into 
an add instruction or a subtract instruction, depending on the exact operand involved. 

Finally, the substitute operation is executed to produce the identical result as 
indicated in block 820. The result is "identical" to the user in that, to the precision 
requested, the result produced by the numerical execution unit is the same as would be 
produced if the original instruction had been performed. Thus, the user may be unaware 
that an add was performed instead of a round, but the execution unit itself may be 
simplified so that it need not handle rare and difficult cases. 
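
The substitution of Figure 8 may be illustrated with a deliberately simple case. The sketch assumes non-negative integers and a hypothetical operation that rounds a value to the nearest multiple of a step (halves rounding up); neither the operation nor the names morph_round and execute come from the disclosure, and the rounding instructions actually contemplated may differ.

```python
def morph_round(value, step=16):
    """Block 810: morph 'round value to a multiple of step' into add or subtract.

    Which substitute is chosen depends on the exact operand involved.
    Assumes value is a non-negative integer.
    """
    remainder = value % step
    if remainder * 2 >= step:
        return ("add", step - remainder)    # round up
    return ("sub", remainder)               # round down (subtracts 0 if exact)


def execute(op, value, operand):
    """Block 820: execute the substitute operation on a plain adder."""
    return value + operand if op == "add" else value - operand


rounded = [execute(*morph_round(v)[:1], v, morph_round(v)[1]) for v in (37, 44, 40, 32)]
```

For the inputs 37, 44, 40, and 32 the substitutes are a subtract of 5, an add of 4, an add of 8, and a subtract of 0, so the execution unit needs only its adder to produce the rounded results.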

Figure 9 illustrates various design representations or formats for simulation,
emulation, and fabrication of a design using the disclosed techniques. Data representing
a design may represent the design in a number of manners. First, as is useful in
simulations, the hardware may be represented using a hardware description language or
another functional description language which essentially provides a computerized model
of how the designed hardware is expected to perform. The hardware model 910 may be
stored in a storage medium 900 such as a computer memory so that the model may be
simulated using simulation software 920 that applies a particular test suite 930 to the
hardware model 910 to determine if it indeed functions as intended. In some
embodiments, the simulation software is not recorded, captured, or contained in the
medium.

Additionally, a circuit level model with logic and/or transistor gates may be
produced at some stages of the design process. This model may be similarly simulated,
sometimes by dedicated hardware simulators that form the model using programmable
logic. This type of simulation, taken a degree further, may be an emulation technique. In
any case, re-configurable hardware is another embodiment that may involve a machine
readable medium storing a model employing the disclosed techniques.

Furthermore, most designs, at some stage, reach a level of data representing the
physical placement of various devices in the hardware model. In the case where
conventional semiconductor fabrication techniques are used, the data representing the
hardware model may be the data specifying the presence or absence of various features on
different mask layers for masks used to produce the integrated circuit. Again, this data
representing the integrated circuit embodies the techniques disclosed in that the circuitry or
logic in the data can be simulated or fabricated to perform these techniques.

In any representation of the design, the data may be stored in any form of a
computer readable medium. An optical or electrical wave 960 modulated or otherwise
generated to transmit such information, a memory 950, or a magnetic or optical storage
940 such as a disc may be the medium. The set of bits describing the design or the
particular part of the design is an article that may be sold in and of itself or used by others
for further design or fabrication.

Thus, replay instruction morphing is disclosed. While certain exemplary 
embodiments have been described and shown in the accompanying drawings, it is to be 
understood that such embodiments are merely illustrative of and not restrictive on the 
broad invention, and that this invention not be limited to the specific constructions and 
arrangements shown and described, since various other modifications may occur to those 
ordinarily skilled in the art upon studying this disclosure.

What is claimed is:

1. An apparatus comprising:
    an execution unit to execute an instruction;
    a replay system to replay an altered instruction if the execution unit executes
    the instruction erroneously.

2. The apparatus of claim 1 wherein the replay system comprises:
    a replay loop to replay the instruction under a first condition; and
    an instruction morphing circuit to replay the altered instruction under a second
    condition.

3. The apparatus of claim 1 wherein the replay system comprises:
    a replay loop to replay the instruction if the instruction is a first instruction;
    and
    an instruction morphing circuit to replay the altered instruction if the
    instruction is a second instruction.

4. The apparatus of claim 3 wherein the first instruction is one of a plurality of
non-modifiable instructions and the second instruction is one of a plurality of modifiable
instructions.

5. The apparatus of claim 4 wherein the plurality of modifiable instructions are morphed
only if a failure in their initial execution occurs.

6. The apparatus of claim 1 wherein said replay system tracks at least one extra bit to
allow alterations of instructions.

7. The apparatus of claim 1 wherein said apparatus comprises a low level cache and a
higher level cache, wherein the replay system is to alter a load instruction that has
already missed in the higher level cache to thereafter only access the low level cache.

8. The apparatus of claim 1 wherein said apparatus comprises a page miss handler to
handle instructions that cause page faults, wherein the instruction is a memory access
that causes a page fault, and wherein the replay system is to change the memory
access to one or more memory accesses to handle the page fault.

9. The apparatus of claim 8 wherein the replay system is to replace the memory access
with a page descriptor read, then to replace said page descriptor read with a page table
entry read, then to reinstate the memory access.

10. The apparatus of claim 1 wherein said instruction is a dependent instruction that is
dependent on a result from a previous instruction, and wherein the replay system is to
alter the dependent instruction to avoid execution in further iterations through the
replay system until the previous instruction has successfully executed.

11. The apparatus of claim 10 wherein the replay system is to alter the dependent
instruction by setting a valid bit for the dependent instruction to indicate that the
instruction is invalid.

1 12. The apparatus of claim 10 wherein the replay system is to alter the dependent 

2 instruction back into an executable form when said previous instruction retires, 

1 13. The apparatus of claim 1 1 wherein the replay system is to reset the valid bit when any 

2 instruction retires. 

1 14. The apparatus of claim 10 wherein the replay system is to track a sequence number 

2 for the previous instruction and wherein the replay system is to return the dependent 

3 instruction to an executable form when said previous instruction completes. 

1 15. The apparatus of claim 10 wherein the apparatus further includes a cache, and 

2 wherein the replay system is to return the dependent instruction to an executable form 

3 when a write to the cache occurs. 
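
As a non-limiting illustration, the handling of a dependent instruction recited in claims 10, 11, and 14 might be modeled as follows. The names ReplayEntry, ReplayQueue, park, circulate, and on_complete are hypothetical; the sketch shows only the sequence-number variant of claim 14 and does not model the retirement or cache-write variants of claims 12, 13, and 15.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplayEntry:
    name: str
    valid: bool = True                  # valid bit of claim 11
    waits_on_seq: Optional[int] = None  # sequence number of the previous instruction


class ReplayQueue:
    def __init__(self):
        self.entries = []
        self.executed = []  # names of instructions that reached execution

    def add(self, entry):
        self.entries.append(entry)

    def park(self, entry, producer_seq):
        """Alter a dependent instruction so that it is skipped on further iterations."""
        entry.valid = False
        entry.waits_on_seq = producer_seq

    def circulate(self):
        """One iteration through the replay loop: only valid entries execute."""
        for entry in self.entries:
            if entry.valid:
                self.executed.append(entry.name)

    def on_complete(self, seq):
        """Return entries waiting on sequence number seq to an executable form."""
        for entry in self.entries:
            if not entry.valid and entry.waits_on_seq == seq:
                entry.valid = True
                entry.waits_on_seq = None
```

A parked entry consumes no execution attempts while circulate runs; it becomes executable again only when on_complete is called with the sequence number it was parked on.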

16. The apparatus of claim 1 wherein said instruction is a high precision instruction and
said replay system is to generate a first result and then the altered instruction is to be
executed to generate a final result from the first result.

17. The apparatus of claim 1 wherein said execution unit is a numeric execution unit and
wherein said replay system is to detect a data dependent condition for the instruction
and to provide the altered instruction to achieve an identical result.

18. The apparatus of claim 17 wherein the instruction is a rounding instruction and the
altered instruction is an add instruction.

19. The apparatus of claim 17 wherein the numeric execution unit lacks hardware to
compute one or more relatively rare numeric cases and wherein such relatively rare
numeric cases are instead implemented by injecting, via the replay system, the altered
instruction to achieve an effectively identical result.

20. A processor comprising:
    a scheduler to dispatch an original instruction;
    an execution unit to attempt execution of the original instruction;
    a checker to determine whether the original instruction executed properly;
    a replay system comprising:
        a replay loop to replay the original instruction;
        a morphing circuit to change the original instruction into an altered
        instruction and to replay the altered instruction.

21. The processor of claim 20 wherein said replay system is coupled to replay the original
instruction when a first condition occurs and to replay the altered instruction when a
second condition occurs.

22. The processor of claim 20 wherein said replay system is coupled to replay the original
instruction when the original instruction is a first instruction and to replay the altered
instruction when the original instruction is a second instruction.

23. The processor of claim 22 wherein said first instruction is one of a plurality of non-
alterable instructions and wherein said second instruction is one of a plurality of
alterable instructions.
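
As a non-limiting illustration, the interaction of the elements recited in claims 20, 22, and 23 might be modeled as follows. The function run, the table ALTERABLE, and the opcode names are hypothetical. The checker is reduced to a per-opcode count of rejected attempts, and each opcode is assumed to appear only once in the dispatched sequence.

```python
from collections import deque

# Hypothetical set of alterable instructions: original opcode -> altered form.
# Any opcode not listed here is treated as non-alterable.
ALTERABLE = {"load": "load_low_level_only"}


def run(ops, rejected_attempts):
    """Return the opcodes presented to the execution unit, in order.

    ops: opcodes in the order the scheduler dispatches them.
    rejected_attempts: original opcode -> number of attempts the checker rejects.
    """
    trace = []
    remaining = dict(rejected_attempts)
    queue = deque((op, op) for op in ops)  # (original, current form)
    while queue:
        original, current = queue.popleft()
        trace.append(current)  # execution unit attempts the instruction
        if remaining.get(original, 0) == 0:
            continue  # checker: executed properly, no replay needed
        remaining[original] -= 1
        # Morphing circuit: an alterable instruction is replayed in its altered
        # form; a non-alterable instruction is replayed unchanged.
        replayed = ALTERABLE.get(original, original)
        queue.append((original, replayed))  # replay loop
    return trace
```

In this sketch the rejected "load" returns through the replay loop in its altered form, whereas the rejected "add" returns unchanged.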

24. A method comprising:
    executing an original instruction;
    determining if a first condition occurs;
    if said first condition occurs, then
        morphing said original instruction to form a morphed instruction; and
        executing said morphed instruction.

25. The method of claim 24 wherein determining if the first condition occurs further
comprises:
    determining whether the original instruction executed improperly.

26. The method of claim 24 further comprising:
    determining whether a second condition occurs;
    if said second condition occurs, then
        replaying said original instruction for execution.

27. The method of claim 24 wherein morphing comprises:
    altering a load instruction that has already missed in a higher level cache to
    thereafter only access a lower level cache.

28. The method of claim 24 wherein morphing comprises:
    altering a page-fault-causing instruction to perform one or more other
    instructions to handle a page fault.

29. An article comprising a machine readable medium that stores data representing an
integrated circuit comprising:
    an execution unit to execute an instruction;
    a replay system to replay an altered instruction if the execution unit executes
    the instruction erroneously.

30. The article of claim 29 storing further data representing the integrated circuit, which
further comprises:
    a replay loop to replay the instruction under a first condition; and
    an instruction morphing circuit to replay the altered instruction under a second
    condition.

31. The article of claim 29 wherein the data representing the integrated circuit comprises
a functional description of the integrated circuit.

32. The article of claim 29 wherein the data representing the integrated circuit comprises
a hardware description language code.

33. The article of claim 29 wherein the data representing the integrated circuit comprises
data representing a plurality of mask layers storing physical data representing the
presence or absence of material at various locations of each of said plurality of mask
layers.

34. An article comprising a machine readable carrier medium having stored thereon data 
which, when loaded into a computer system memory in conjunction with simulation 
routines, provides functionality of a model comprising: 

an execution unit to execute an instruction; 

a replay system to replay an altered instruction if the execution unit executes 
the instruction erroneously. 

35. The article of claim 34 wherein the model further comprises: 
a replay loop to replay the instruction under a first condition; and 
an instruction morphing circuit to replay the altered instruction under a second
condition.





Abstract



Replay instruction morphing. One disclosed apparatus includes an execution unit 
to execute an instruction. A replay system replays an altered instruction if the execution 
unit executes the instruction erroneously. 
Attorney's Docket No.: 042390.P8007 Patent
DECLARATION AND POWER OF ATTORNEY FOR PATENT APPLICATION



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below, next to my name. 

I believe I am the original, first, and sole inventor (if only one name is listed below) or an original, 
first, and joint inventor (if plural names are listed below) of the subject matter which is claimed and 
for which a patent is sought on the invention entitled 

REPLAY INSTRUCTION MORPHING 



the specification of which

  [X] is attached hereto.

  [ ] was filed on ________ as United States Application Number ________
      or PCT International Application Number ________
      and was amended on ________ (if applicable).

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claim(s), as amended by any amendment referred to above. I do not 
know and do not believe that the claimed invention was ever known or used in the United States of 
America before my invention thereof, or patented or described in any printed publication in any 
country before my invention thereof or more than one year prior to this application, that the same 
was not in public use or on sale in the United States of America more than one year prior to this 
application, and that the invention has not been patented or made the subject of an inventor's 
certificate issued before the date of this application in any country foreign to the United States of 
America on an application filed by me or my legal representatives or assigns more than twelve 
months (for a utility patent application) or six months (for a design patent application) prior to this 
application. 

I acknowledge the duty to disclose all information known to me to be material to patentability as 
defined in Title 37, Code of Federal Regulations, Section 1.56. 

I hereby claim foreign priority benefits under Title 35, United States Code, Section 119(a)-(d), of any 
foreign application(s) for patent or inventor's certificate listed below and have also identified below 
any foreign application for patent or inventor's certificate having a filing date before that of the 
application on which priority is claimed: 
Prior Foreign Application(s)                                   Priority Claimed

(Number)        (Country)        (Day/Month/Year Filed)        Yes    No
(Number)        (Country)        (Day/Month/Year Filed)        Yes    No
(Number)        (Country)        (Day/Month/Year Filed)        Yes    No



I hereby claim the benefit under Title 35, United States Code, Section 119(e) of any United States
provisional application(s) listed below:



(Application Number) Filing Date 



(Application Number) Filing Date 



I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States 
application(s) listed below and, insofar as the subject matter of each of the claims of this application
is not disclosed in the prior United States application in the manner provided by the first paragraph
of Title 35, United States Code, Section 112, I acknowledge the duty to disclose all information
known to me to be material to patentability as defined in Title 37, Code of Federal Regulations, 
Section 1.56 which became available between the filing date of the prior application and the national 
or PCT international filing date of this application: 



(Application Number) Filing Date (Status: patented, pending, abandoned)



(Application Number) Filing Date (Status: patented, pending, abandoned)

I hereby appoint the persons listed on Appendix A hereto (which is incorporated by reference and a 
part of this document) as my respective patent attorneys and patent agents, with full power of 
substitution and revocation, to prosecute this application and to transact all business in the Patent 
and Trademark Office connected herewith. 

Send correspondence to Jeffrey S. Draeger, BLAKELY, SOKOLOFF, TAYLOR &
ZAFMAN LLP, 12400 Wilshire Boulevard, 7th Floor, Los Angeles, California 90025 and direct
telephone calls to Jeffrey S. Draeger, (408) 720-8300.



Rev. 06/27/00 (D1) 






I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these
statements were made with the knowledge that willful false statements and the like so made 
are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United 
States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 

Full Name of Sole/First Inventor Douglas M. Carmean 



Inventor's Signature Date 

Residence Beaverton, Oregon Citizenship USA 

(City, State) (Country) 

Post Office Address 14815 SW Bonnie Brae 

Beaverton, Oregon 97007



Full Name of Second/Joint Inventor David J. Sager



Inventor's Signature Date

Residence Portland, Oregon Citizenship USA

(City, State) (Country)

Post Office Address 9450 N.W. Skyview Drive

Portland, Oregon 97231



Full Name of Third/Joint Inventor Thomas F. Toll 



Inventor's Signature Date 

Residence Portland, Oregon Citizenship USA

(City, State) (Country)

Post Office Address 1517 SW Montgomery Street

Portland, Oregon 97201



Full Name of Fourth/Joint Inventor Karol F. Menezes 

Inventor's Signature Date 

Residence Portland, Oregon Citizenship USA

(City, State) (Country)

Post Office Address 5365 NW Lianna Way

Portland, Oregon 97212
APPENDIX A



William E. Alford, Reg. No. 37,764; Farzad E. Amini, Reg. No. P42,261; Aloysius T. C. AuYeung, Reg. No.
35,432; William Thomas Babbitt, Reg. No. 39,591; Carol F. Barry, Reg. No. 41,600; Jordan Michael
Becker, Reg. No. 39,602; Lisa N. Benado, Reg. No. 39,995; Bradley J. Bereznak, Reg. No. 33,474;
Michael A. Bernadicou, Reg. No. 35,934; Roger W. Blakely, Jr., Reg. No. 25,831; R. Alan Burnett, Reg.
No. 46,149; Gregory D. Caldwell, Reg. No. 39,926; Andrew C. Chen, Reg. No. 43,544; Thomas M.
Coester, Reg. No. 39,637; Donna Jo Coningsby, Reg. No. 41,684; Florin Corie, Reg. No. 46,244; Dennis
M. deGuzman, Reg. No. 41,702; Stephen M. De Klerk, Reg. No. P46,503; Michael Anthony DeSanctis,
Reg. No. 39,957; Daniel M. De Vos, Reg. No. 37,813; Robert Andrew Diehl, Reg. No. 40,992; Sanjeet
Dutta, Reg. No. P46,145; Matthew C. Fagan, Reg. No. 37,542; Tarek N. Fahmi, Reg. No. 41,402; George
Fountain, Reg. No. 37,374; Paramita Ghosh, Reg. No. 42,806; James Y. Go, Reg. No. 40,621; James A.
Henry, Reg. No. 41,064; Libby N. Ho, Reg. No. P46,774; Willmore F. Holbrow III, Reg. No. P41,845;
Sheryl Sue Holloway, Reg. No. 37,850; George W Hoover II, Reg. No. 32,992; Eric S. Hyman, Reg. No.
30,139; William W. Kidd, Reg. No. 31,772; Sang Hui Kim, Reg. No. 40,450; Walter T. Kim, Reg. No.
42,731; Eric T. King, Reg. No. 44,188; Erica W. Kuo, Reg. No. 42,775; George Brian Leavell, Reg. No.
45,436; Kurt P. Leyendecker, Reg. No. 42,799; Gordon R. Lindeen III, Reg. No. 33,192; Jan Carol Little,
Reg. No. 41,181; Joseph Lutz, Reg. No. 43,765; Michael J. Mallie, Reg. No. 36,591; Andre L Marais,
under 37 C.F.R. § 10.9(b); Paul A. Mendonsa, Reg. No. 42,879; Clive D. Menezes, Reg. No. 45,493;
Chun M. Ng, Reg. No. 36,878; Thien T. Nguyen, Reg. No. 43,835; Thinh V. Nguyen, Reg. No. 42,034;
Dennis A. Nicholls, Reg. No. 42,036; Daniel E. Ovanezian, Reg. No. 41,236; Kenneth B. Paley, Reg. No.
38,989; Marina Portnova, Reg. No. P45,750; William F. Ryann, Reg. No. 44,313; James H. Salter, Reg. No.
35,668; William W. Schaal, Reg. No. 39,018; James C. Scheller, Reg. No. 31,195; Jeffrey Sam Smith,
Reg. No. 39,377; Maria McCormack Sobrino, Reg. No. 31,639; Stanley W. Sokoloff, Reg. No. 25,128;
Judith A. Szepesi, Reg. No. 39,393; Vincent P. Tassinari, Reg. No. 42,179; Edwin H. Taylor, Reg. No.
25,129; John F. Travis, Reg. No. 43,203; Joseph A. Twarowski, Reg. No. 42,191; Tom Van Zandt, Reg.
No. 43,219; Lester J. Vincent, Reg. No. 31,460; Glenn E. Von Tersch, Reg. No. 41,364; John
Patrick Ward, Reg. No. 40,216; Mark L. Watson, Reg. No. P46,322; Thomas C. Webster, Reg. No.
P46,154; Steven D. Yates, Reg. No. 42,242; and Norman Zafman, Reg. No. 26,250; my patent attorneys,
and Firasat Ali, Reg. No. 45,715; and Justin M. Dillon, Reg. No. 42,486; my patent agents, of BLAKELY,
SOKOLOFF, TAYLOR & ZAFMAN LLP, with offices located at 12400 Wilshire Boulevard, 7th Floor,
Los Angeles, California 90025, telephone (310) 207-3800, and James R. Thein, Reg. No. 31,710, my
patent attorney with full power of substitution and revocation, to prosecute this application and to transact
all business in the Patent and Trademark Office connected herewith.
APPENDIX B



Title 37, Code of Federal Regulations, Section 1.56
Duty to Disclose Information Material to Patentability 

(a) A patent by its very nature is affected with a public interest. The public interest is best served, 
and the most effective patent examination occurs when, at the time an application is being examined, the 
Office is aware of and evaluates the teachings of all information material to patentability. Each individual 
associated with the filing and prosecution of a patent application has a duty of candor and good faith in 
dealing with the Office, which includes a duty to disclose to the Office all information known to that individual 
to be material to patentability as defined in this section. The duty to disclose information exists with respect
to each pending claim until the claim is cancelled or withdrawn from consideration, or the application becomes 
abandoned. Information material to the patentability of a claim that is cancelled or withdrawn from 
consideration need not be submitted if the information is not material to the patentability of any claim 
remaining under consideration in the application. There is no duty to submit information which is not material 
to the patentability of any existing claim. The duty to disclose all information known to be material to
patentability is deemed to be satisfied if all information known to be material to patentability of any claim 
issued in a patent was cited by the Office or submitted to the Office in the manner prescribed by §§ 1.97(b)-(d)
and 1.98. However, no patent will be granted on an application in connection with which fraud on the Office
was practiced or attempted or the duty of disclosure was violated through bad faith or intentional misconduct. 
The Office encourages applicants to carefully examine: 

(1) Prior art cited in search reports of a foreign patent office in a counterpart application, and

(2) The closest information over which individuals associated with the filing or prosecution of a 
patent application believe any pending claim patentably defines, to make sure that any material information 
contained therein is disclosed to the Office. 

(b) Under this section, information is material to patentability when it is not cumulative to 
information already of record or being made of record in the application, and

(1) It establishes, by itself or in combination with other information, a prima facie case of
unpatentability of a claim; or 

(2) It refutes, or is inconsistent with, a position the applicant takes in: 

(i) Opposing an argument of unpatentability relied on by the Office, or 

(ii) Asserting an argument of patentability.

A prima facie case of unpatentability is established when the information compels a conclusion that a claim is 
unpatentable under the preponderance of evidence, burden-of-proof standard, giving each term in the claim 
its broadest reasonable construction consistent with the specification, and before any consideration is given to 
evidence which may be submitted in an attempt to establish a contrary conclusion of patentability. 

(c) Individuals associated with the filing or prosecution of a patent application within the 
meaning of this section are: 

(1) Each inventor named in the application; 

(2) Each attorney or agent who prepares or prosecutes the application; and 

(3) Every other person who is substantively involved in the preparation or prosecution of the 
application and who is associated with the inventor, with the assignee or with anyone to whom there is an 
obligation to assign the application. 

(d) Individuals other than the attorney, agent or inventor may comply with this section by
disclosing information to the attorney, agent, or inventor.